Method and system for empirical modeling of time-varying, parameter-varying, and nonlinear systems via iterative linear subspace computation

ABSTRACT

Methods and systems for estimating differential or difference equations that can govern a nonlinear, time-varying and parameter-varying dynamic process or system. The methods and systems for estimating the equations may be based upon estimations of observed outputs and, when desired, input data for the equations. The methods and systems can be utilized with any system or process that may be capable of being described with nonlinear, time-varying and parameter-varying difference equations and can be used for automated extraction of the difference equations in describing detailed system or method behavior for use in system control, fault detection, state estimation and prediction and adaptation of the same to changes in a system or method.

CROSS-REFERENCE APPLICATIONS

The present invention claims priority under 35 U.S.C. §120 to U.S. Provisional Patent Application No. 61/239,745, filed on Sep. 3, 2009, the disclosure of which is incorporated by reference herein in its entirety.

BACKGROUND

The modeling of nonlinear and time-varying dynamic processes or systems from measured output data and possibly input data is an emerging area of technology. Depending on the area of theory or application, it may be called time series analysis in statistics, system identification in engineering, longitudinal analysis in psychology, and forecasting in financial analysis.

In the past there has been the innovation of subspace system identification methods and considerable development and refinement including optimal methods for systems involving feedback, exploration of methods for nonlinear systems including bilinear systems and linear parameter varying (LPV) systems. Subspace methods can avoid iterative nonlinear parameter optimization that may not converge, and use numerically stable methods of considerable value for high order large scale systems.

In the area of time-varying and nonlinear systems there has been work undertaken, albeit without the desired results. This work is typical of the present state of the art in that rather direct extensions of linear subspace methods are used for modeling nonlinear systems. This approach expresses the past and future as linear combinations of nonlinear functions of past inputs and outputs. One consequence of this approach is that the dimension of the past and future expands exponentially in the number of measured inputs, outputs, states, and lags of the past that are used. When using only a few of each of these variables, the dimension of the past can number over 10⁴ or even more than 10⁶. For typical industrial processes, the dimension of the past can easily exceed 10⁹ or even 10¹². Dimensions of this size make the computation inefficient at best.

Other techniques use an iterative subspace approach to estimating the nonlinear terms in the model and as a result require very modest computation. This approach involves a heuristic algorithm, and has been used for high accuracy model identification in the case of LPV systems with a random scheduling function, i.e. with white noise characteristics. One of the problems, however, is that in most LPV systems the scheduling function is usually determined by the particular application, and is often very non-random in character. Several modifications have been implemented to attempt to improve the accuracy for the case of nonrandom scheduling functions, but they did not succeed in substantially improving the modeling accuracy.

In a more general context, the general problem of identification of nonlinear systems is known as a general nonlinear canonical variate analysis (CVA) procedure. The problem was illustrated with the Lorenz attractor, a chaotic nonlinear system described by a simple nonlinear difference equation. Thus nonlinear functions of the past and future are determined to describe the state of the process, which is in turn used to express the nonlinear state equations for the system. One major difficulty in this approach is to find a feasible computational implementation, since the number of required nonlinear functions of past and future expands exponentially, as is well known. This difficulty has often been encountered in finding a solution to the system identification problem that applies to general nonlinear systems.

Thus, in some exemplary embodiments described below, methods and systems may be described that can achieve considerable improvement and also produce optimal results in the case where a ‘large sample’ of observations is available. In addition, the method is not ‘ad hoc’ but can involve optimal statistical methods.

SUMMARY

One exemplary embodiment describes a method for utilizing nonlinear, time-varying and parameter-varying dynamic processes. The method may be used for generating reduced models of systems having time varying elements. The method can include steps for expanding state space difference equations; expressing difference equations as a linear, time-invariant system in terms of outputs and augmented inputs; and estimating coefficients of the state equations.

Another exemplary embodiment may describe a system for estimating a set of equations governing nonlinear, time-varying and parameter-varying processes. The system can have a first input, a second input, a feedback box and a time delay box. Additionally, in the system, the first input and the second input may be passed through the feedback box to the time delay box to produce an output.

DETAILED DESCRIPTION

Aspects of the present invention are disclosed in the following description and related figures directed to specific embodiments of the invention. Those skilled in the art will recognize that alternate embodiments may be devised without departing from the spirit or the scope of the claims. Additionally, well-known elements of exemplary embodiments of the invention will not be described in detail or will be omitted so as not to obscure the relevant details of the invention.

As used herein, the word “exemplary” means “serving as an example, instance or illustration.” The embodiments described herein are not limiting, but rather are exemplary only. It should be understood that the described embodiments are not necessarily to be construed as preferred or advantageous over other embodiments. Moreover, the terms “embodiments of the invention”, “embodiments” or “invention” do not require that all embodiments of the invention include the discussed feature, advantage or mode of operation.

Further, many embodiments are described in terms of sequences of actions to be performed by, for example, elements of a computing device. It will be recognized that various actions described herein can be performed by specific circuits (e.g., application specific integrated circuits (ASICs)), by program instructions being executed by one or more processors, or by a combination of both. Additionally, these sequences of actions described herein can be considered to be embodied entirely within any form of computer readable storage medium having stored therein a corresponding set of computer instructions that upon execution would cause an associated processor to perform the functionality described herein. Thus, the various aspects of the invention may be embodied in a number of different forms, all of which have been contemplated to be within the scope of the claimed subject matter. In addition, for each of the embodiments described herein, the corresponding form of any such embodiments may be described herein as, for example, “logic configured to” perform the described action.

Generally referring to exemplary FIGS. 1-6, methods and systems for empirical modeling of time-varying, parameter-varying and nonlinear difference equations may be described. The methods and systems can be implemented efficiently and utilized to provide a variety of results.

As shown in exemplary FIG. 1, a flow chart of a methodology for empirical modeling of time-varying, parameter-varying and nonlinear difference equations according to an exemplary embodiment may be shown. Here, in 102, a set of time-varying, parameter-varying and, if desired, nonlinear state space difference equations may be utilized. In 104, the equations may then be expanded with respect to a chosen set of basis functions; for example, nonlinear input-output equations may be expanded in polynomials in x_(t) and u_(t). Then, in 106, the difference equations may be expressed as a linear time-invariant system, for example, in terms of outputs y_(t) and augmented inputs ũ_(t), which can include inputs u_(t) and basis functions, for example polynomials, in inputs u_(t), scheduling functions ρ_(t) and states x_(t).

Exemplary FIG. 2 may show an exemplary flow chart where a linear, parameter varying system of difference equations may be utilized. In this embodiment in 202 a set of linear parameter varying state space equations, such as those shown below as equation 1 and equation 2, may be used.

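$x_{t+1} = A_{0}x_{t} + B_{0}u_{t} + \left\lbrack A_{1}\rho_{t}(1) + \ldots + A_{s}\rho_{t}(s) \right\rbrack x_{t} + \left\lbrack B_{1}\rho_{t}(1) + \ldots + B_{s}\rho_{t}(s) \right\rbrack u_{t}$   (1)

$y_{t} = C_{0}x_{t} + D_{0}u_{t} + \left\lbrack C_{1}\rho_{t}(1) + \ldots + C_{s}\rho_{t}(s) \right\rbrack x_{t} + \left\lbrack D_{1}\rho_{t}(1) + \ldots + D_{s}\rho_{t}(s) \right\rbrack u_{t}$   (2)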

Then, in 204, the state space difference equations may be expanded with respect to polynomials in the scheduling function ρ_(t), states x_(t) and inputs u_(t), for example as shown in equations 3 and 4 below.

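$x_{t+1} = A_{0}x_{t} + B_{0}u_{t} + \left\lbrack A_{1} \ldots A_{s} \right\rbrack\left( \rho_{t} \otimes x_{t} \right) + \left\lbrack B_{1} \ldots B_{s} \right\rbrack\left( \rho_{t} \otimes u_{t} \right)$   (3)

$y_{t} = C_{0}x_{t} + D_{0}u_{t} + \left\lbrack C_{1} \ldots C_{s} \right\rbrack\left( \rho_{t} \otimes x_{t} \right) + \left\lbrack D_{1} \ldots D_{s} \right\rbrack\left( \rho_{t} \otimes u_{t} \right)$   (4)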

Next, in 206, the difference equations can be expressed in terms of original outputs y_(t) and augmented inputs [u_(t), (ρ_(t)⊗x_(t)), (ρ_(t)⊗u_(t))] that are functions of u_(t), x_(t) and ρ_(t). The difference equations can have linear time-invariant unknown coefficients (A₀, [B₀ A_(ρ) B_(ρ)], C₀, [D₀ C_(ρ) D_(ρ)]) that can be estimated, as shown in equations 5 and 6 below.

$\begin{matrix} {x_{t + 1} = {{A_{0}x_{t}} + {\left\lbrack {B_{0}\; A_{\rho}\; B_{\rho}} \right\rbrack \begin{bmatrix} u_{t} \\ {\rho_{t} \otimes x_{t}} \\ {\rho_{t} \otimes u_{t}} \end{bmatrix}}}} & \left( {{Equation}\mspace{14mu} 5} \right) \\ {y_{t} = {{C_{0}x_{t}} + {\left\lbrack {D_{0}\; C_{\rho}\; D_{\rho}} \right\rbrack \begin{bmatrix} u_{t} \\ {\rho_{t} \otimes x_{t}} \\ {\rho_{t} \otimes u_{t}} \end{bmatrix}}}} & \left( {{Equation}\mspace{14mu} 6} \right) \end{matrix}$
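As a quick numerical check of the rewriting in equations 5 and 6, the following minimal Python sketch builds the augmented input with NumPy and confirms that the summation form of equation 1 and the stacked form of equation 5 give the same next state. The dimensions, random matrices and variable names are illustrative assumptions, not values from this disclosure.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, nu, ns = 3, 2, 2  # state, input and scheduling dimensions (illustrative)

# Hypothetical coefficient matrices A_0..A_s and B_0..B_s
A = [rng.standard_normal((nx, nx)) for _ in range(ns + 1)]
B = [rng.standard_normal((nx, nu)) for _ in range(ns + 1)]

x = rng.standard_normal(nx)
u = rng.standard_normal(nu)
rho = rng.standard_normal(ns)

# Equation 1: explicit sum over the scheduling parameters
x_next_lpv = A[0] @ x + B[0] @ u
for i in range(1, ns + 1):
    x_next_lpv = x_next_lpv + rho[i - 1] * (A[i] @ x + B[i] @ u)

# Equation 5: time-invariant coefficients acting on the augmented input
A_rho = np.hstack(A[1:])  # [A_1 ... A_s]
B_rho = np.hstack(B[1:])  # [B_1 ... B_s]
u_aug = np.concatenate([u, np.kron(rho, x), np.kron(rho, u)])
x_next_lti = A[0] @ x + np.hstack([B[0], A_rho, B_rho]) @ u_aug
```

The two results agree to rounding error, and the augmented input has dimension nu + ns·(nx + nu).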

Then, as shown in 208, the augmented inputs [u_(t), (ρ_(t)⊗x_(t)), (ρ_(t)⊗u_(t))], and, in some exemplary embodiments, specifically ρ_(t)⊗x_(t), can involve an unknown state vector x_(t), so iteration may be utilized or desired. Thus, in such exemplary embodiments, an iterated algorithm, as described in further detail below, may be utilized.

Exemplary FIG. 3 can show a flow chart of an iterated algorithm that may be implemented for iterated subspace identification. In this embodiment, and as seen in 302, nonlinear difference equations can be expanded in additive basis functions and expressed in linear time-invariant form with augmented inputs ũ_(t). This can include, in some examples, nonlinear basis functions involving outputs, state and scheduling functions.

Then, in 304, on the first iteration the state estimate x̂_(t)^([0]) is unknown, so the corresponding terms ρ_(t)⊗x_(t) may be deleted from the set of augmented inputs ũ_(t); this is equivalent to omitting the (A_(ρ), C_(ρ)) terms from the LPV model. The iterated algorithm may then be implemented using the remaining augmented inputs as the inputs and can compute estimates Θ̂^([1]) of the model parameters. Then, using, for example, a Kalman filter at the estimated parameter values Θ̂^([1]), the state estimates x̂_(t)^([1]) can be computed along with the one-step prediction innovations. Then the likelihood function can be evaluated.

Next, in 306, an iteration k for k≥2 may be made. Here the state estimate x̂_(t)^([k−1]) from the previous iteration may be used to initialize the terms ρ_(t)⊗x_(t) in the augmented inputs for all t. The iterative algorithm may then be implemented using the augmented inputs as the inputs and can compute estimates Θ̂^([k]) of the parameters. Again, for example, using a Kalman filter at the estimated parameter values Θ̂^([k]), the state estimates x̂_(t)^([k]) may be computed. The one-step prediction innovations may also be computed and the likelihood p(Y_(1:N)|Ũ_(1:N);Θ̂^([k])) may also be evaluated.

In 308 the convergence can be checked. Here the change in the values of the log likelihood function and the state orders between iteration k−1 and iteration k can be compared. If, in some examples, the state order is the same and the change in the log likelihood function is less than a chosen threshold ε, then the iterations may end or be stopped. This threshold in many examples may be less than one, for example 0.01. Otherwise, where the change is above the chosen threshold ε or the state order has changed, step 306 above may be returned to and iteration k+1 may be performed. Following the performance of iteration k+1, the convergence may then be checked again.
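The convergence test of 308 can be written as a small predicate. The following Python sketch is illustrative only; the function name and argument names are assumptions, and the default threshold of 0.01 is the example value mentioned above.

```python
def has_converged(loglik_prev, loglik_curr, order_prev, order_curr, eps=0.01):
    """Return True when the state order is unchanged between iterations
    k-1 and k and the change in log likelihood is below eps."""
    same_order = order_prev == order_curr
    small_change = abs(loglik_curr - loglik_prev) < eps
    return same_order and small_change
```

A caller would evaluate this after each iteration k≥2 and return to step 306 whenever it yields False.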

In another exemplary embodiment, a different approach may be taken to directly and simply obtain optimal or desired estimates of the unknown parameters for the case of autocorrelated errors and feedback in the system using, for example, subspace methods developed for linear time-invariant systems. This may be done by expressing the problem in a different form that can lead to a need to iterate on the state estimate; however, the number of iterations may be very low, and, to further simplify the system and its development, stochastic noise may be removed.

For example, consider a linear system where the system matrices are time varying functions of a vector of scheduling parameters ρ_(t)=(ρ_(t)(1) ρ_(t)(2) . . . ρ_(t)(s))^(T) of the forms as shown in the equations below:

$x_{t+1} = A\left( \rho_{t} \right)x_{t} + B\left( \rho_{t} \right)u_{t}$   (7)

$y_{t} = C\left( \rho_{t} \right)x_{t} + D\left( \rho_{t} \right)u_{t}$   (8)

For affine dependence on the scheduling parameters, the state space matrices can have the form of the following equations 9 through 12.

$\begin{matrix} {{A\left( \rho_{t} \right)} = {A_{0} + {\sum\limits_{i = 1}^{s}{{\rho_{t}(i)}A_{i}}}}} & \left( {{Equation}\mspace{14mu} 9} \right) \\ {{B\left( \rho_{t} \right)} = {B_{0} + {\sum\limits_{i = 1}^{s}{{\rho_{t}(i)}B_{i}}}}} & \left( {{Equation}\mspace{14mu} 10} \right) \\ {{C\left( \rho_{t} \right)} = {C_{0} + {\sum\limits_{i = 1}^{s}{{\rho_{t}(i)}C_{i}}}}} & \left( {{Equation}\mspace{14mu} 11} \right) \\ {{D\left( \rho_{t} \right)} = {D_{0} + {\sum\limits_{i = 1}^{s}{{\rho_{t}(i)}D_{i}}}}} & \left( {{Equation}\mspace{14mu} 12} \right) \end{matrix}$

In the above equations, it may be noted that the matrices on the left hand side are expressed as linear combinations of the matrices on the right hand side, with weights (1 ρ_(t)(1) ρ_(t)(2) . . . ρ_(t)(s))^(T) specified by the scheduling parameters ρ_(t), which may be called an affine form. In further discussion, the notation A=[A₀ A₁ . . . A_(s)]=[A₀ A_(ρ)] will be used for the system matrix, and similarly for B, C, and D.
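The affine form of equations 9 through 12 and the stacked notation A=[A₀ A_(ρ)] can be illustrated with a short Python sketch. The helper name `affine_matrix`, the dimensions and the numerical values are assumptions chosen for illustration.

```python
import numpy as np

def affine_matrix(blocks, rho):
    """blocks = [M_0, M_1, ..., M_s]; returns M_0 + sum_i rho(i) * M_i."""
    M = blocks[0].copy()
    for rho_i, M_i in zip(rho, blocks[1:]):
        M = M + rho_i * M_i
    return M

rng = np.random.default_rng(0)
n, s = 3, 2
A_blocks = [rng.standard_normal((n, n)) for _ in range(s + 1)]
rho_t = np.array([0.5, -1.5])

A_of_rho = affine_matrix(A_blocks, rho_t)

# Same matrix obtained from the stacked notation A = [A_0 A_1 ... A_s]
A_stacked = np.hstack(A_blocks)                     # n x (s+1)n
weights = np.concatenate([[1.0], rho_t])[:, None]   # (1, rho_t(1), ..., rho_t(s))^T
A_from_stack = A_stacked @ np.kron(weights, np.eye(n))
```

Setting all scheduling parameters to zero recovers A₀, which is the time-invariant part of the system.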

In some further exemplary embodiments, system identification methods for the class of LPV systems can have a number of potential applications and economic value. Such systems can include, but are not limited to, aerodynamic and fluid dynamic vehicles, for example aircraft and ships, automotive engine dynamics, turbine engine dynamics, chemical processes, for example stirred tank reactors and distillation columns, amongst others. One feature can be that at any given operating point ρ_(t) the system dynamics can be described as a linear system. The scheduling parameters ρ_(t) may be complex nonlinear functions of operating point variables, for example, but not limited to, speed, pressures, temperatures, fuel flows and the like, that may be known or accurately measured variables that characterize the system dynamics within possibly unknown constant matrices A, B, C and D. It may also be assumed that ρ_(t) may be computable or determinable from the knowledge of any such operating point variables. For example, LPV models of automotive engines can involve the LPV state space equations that explicitly express the elements of the vector ρ_(t) as very complex nonlinear functions of various operating point variables. In some exemplary embodiments described herein, it may only be desired that the scheduling parameter ρ_(t) may be available when the system identification computations are performed. This can be a relaxation of the real-time use or requirement for such applications as real-time control or filtering.

To simplify the discussion, the LPV equations can be written in time-invariant form by associating the scheduling parameter ρ_(t) with the inputs u_(t) and states x_(t) as

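$x_{t+1} = A_{0}x_{t} + B_{0}u_{t} + \left\lbrack A_{1} \ldots A_{s} \right\rbrack\left( \rho_{t} \otimes x_{t} \right) + \left\lbrack B_{1} \ldots B_{s} \right\rbrack\left( \rho_{t} \otimes u_{t} \right)$   (13)

$y_{t} = C_{0}x_{t} + D_{0}u_{t} + \left\lbrack C_{1} \ldots C_{s} \right\rbrack\left( \rho_{t} \otimes x_{t} \right) + \left\lbrack D_{1} \ldots D_{s} \right\rbrack\left( \rho_{t} \otimes u_{t} \right)$   (14)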

Here, ⊗ can denote the Kronecker product M⊗N, defined for any matrices M and N as the partitioned matrix whose i,j block is (M⊗N)_(i,j)=m_(i,j)N, with the i,j element of M denoted as m_(i,j). Also, the notation [M;N]=[M^(T) N^(T)]^(T) can be used for stacking the vectors or matrices M and N. Equations 13 and 14 above can then also be written as shown below in the formats of equations 15 and 16.
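The block definition of the Kronecker product and the stacking notation can be checked numerically. The sketch below uses NumPy's `np.kron`; the matrices, vectors and the helper `block` are illustrative assumptions.

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
N = np.array([[0.0, 1.0, 2.0],
              [3.0, 4.0, 5.0]])

K = np.kron(M, N)  # 4 x 6 partitioned matrix
p, q = N.shape

def block(i, j):
    """Return the (i, j) block of M ⊗ N, using zero-based indices."""
    return K[i * p:(i + 1) * p, j * q:(j + 1) * q]

# Stacked feedback vector [rho ⊗ x; rho ⊗ u] for column vectors
rho_t = np.array([2.0, -1.0])
x_t = np.array([1.0, 0.0, 3.0])
u_t = np.array([5.0])
f_t = np.concatenate([np.kron(rho_t, x_t), np.kron(rho_t, u_t)])
```

Each block of `K` equals the corresponding element of M times N, and `f_t` lists ρ_(t)(1)x_(t), ρ_(t)(2)x_(t), then ρ_(t)(1)u_(t), ρ_(t)(2)u_(t).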

$\begin{matrix} {x_{t + 1} = {{A_{0}x_{t}} + {\left\lbrack {B_{0}A_{\rho}B_{\rho}} \right\rbrack \begin{bmatrix} u_{t} \\ \left( {\rho_{t} \otimes x_{t}} \right) \\ \left( {\rho_{t} \otimes u_{t}} \right) \end{bmatrix}}}} & (15) \\ {y_{t} = {{C_{0}x_{t}} + {\left\lbrack {D_{0}C_{\rho}D_{\rho}} \right\rbrack \begin{bmatrix} u_{t} \\ \left( {\rho_{t} \otimes x_{t}} \right) \\ \left( {\rho_{t} \otimes u_{t}} \right) \end{bmatrix}}}} & (16) \end{matrix}$

As discussed in more detail below, the above equations can be interpreted as a linear time-invariant (LTI) system with nonlinear feedback of f_(t)=[(ρ_(t)⊗x_(t));(ρ_(t)⊗u_(t))] where the states x_(t) and inputs u_(t) can be multiplied by the time varying scheduling parameters ρ_(t). The feedback inputs f_(t) can now be considered as actual inputs to the LTI system. As shown in further detail below, the matrices [A_(ρ) B_(ρ); C_(ρ) D_(ρ)] of the LPV system description can be the appropriate quantities to describe the LTI feedback representation of the LPV system.

Further, the above equations may now be described as shown below in equations 17 and 18.

$x_{t+1} = \tilde{A}x_{t} + \tilde{B}{\tilde{u}}_{t}$   (17)

$y_{t} = \tilde{C}x_{t} + \tilde{D}{\tilde{u}}_{t}$   (18)

Thus for measurements of outputs y_(t) and inputs ũ_(t)=[u_(t)^(T) (ρ_(t)⊗x_(t))^(T) (ρ_(t)⊗u_(t))^(T)]^(T) the time-invariant matrices can be (Ã, B̃, C̃, D̃)=(A₀, [B₀ A_(ρ) B_(ρ)], C₀, [D₀ C_(ρ) D_(ρ)]) respectively. Also, in situations where x_(t) in ρ_(t)⊗x_(t) may not be a known or measured quantity, a prior estimate of x_(t) may be available or utilized and iterations may be used to obtain a more accurate or desired estimate of x_(t).

In still further exemplary embodiments, an LPV system can be expressed as a linear time-invariant system with nonlinear internal feedback that can involve the known parameter varying functions ρ_(t) (see Section 2.1 of Nonlinear System Identification: A State-Space Approach, Vincent Verdult, 2002, Ph.D. thesis, University of Twente, The Netherlands, the contents of which are hereby incorporated by reference in their entirety). In this exemplary embodiment the system matrices P_(i) of rank r_(i) may be factored for each i with 1≤i≤s using a singular value decomposition, such as that shown in equation 19.

$\begin{matrix} {P_{i} = {\begin{bmatrix} A_{i} & B_{i} \\ C_{i} & D_{i} \end{bmatrix} = {\begin{bmatrix} B_{f,i} \\ D_{f,i} \end{bmatrix}\left\lbrack {C_{z,i}D_{z,i}} \right\rbrack}}} & \left( {{Equation}\mspace{14mu} 19} \right) \end{matrix}$

The quantities may then be defined as the following, shown in equations 20 through 23.

B_(f) = [B_(f,1) B_(f,2) . . . B_(f,s)]   (20)

D_(f) = [D_(f,1) D_(f,2) . . . D_(f,s)]   (21)

C_(z)^(T) = [C_(z,1)^(T) C_(z,2)^(T) . . . C_(z,s)^(T)]   (22)

D_(z)^(T) = [D_(z,1)^(T) D_(z,2)^(T) . . . D_(z,s)^(T)]   (23)

Next, internal feedback in the LTI system P₀=[A₀ B₀; C₀ D₀] may be considered with outputs z_(t)=C_(z)x_(t)+D_(z)u_(t) and with nonlinear feedback from z_(t) to f_(t)=ρ_(t)⊗z_(t) entering the LTI system P₀ state equations through the input matrices B_(f) and D_(f). The state equations for x_(t+1) and y_(t) of the feedback system may be shown below in equations 24 through 27.

$x_{t+1} = A_{0}x_{t} + B_{0}u_{t} + B_{f}f_{t}$   (24)

$y_{t} = C_{0}x_{t} + D_{0}u_{t} + D_{f}f_{t}$   (25)

$z_{t} = C_{z}x_{t} + D_{z}u_{t}$   (26)

$f_{t} = \left\lbrack \rho_{t} \otimes z_{t} \right\rbrack$   (27)

With respect to the above and referring now to exemplary FIG. 4, if it can be assumed that for any time t there is no effect of the feedback f_(t) on the output z_(t) as in the feedback structure shown in FIG. 4 and in equation 26, then there is no parameter dependence in the linear fractional transformation (LFT) description (see K. Zhou, J. Doyle, and K. Glover (1996), Robust and Optimal Control, Prentice-Hall, Inc., Section 10.2, the contents of which are hereby incorporated by reference in their entirety). As shown in FIG. 4, there may be a system 400 having two boxes, box 404 which may be a memory-less nonlinear feedback system and box 402 that can be a linear time-invariant system. Therefore, in some exemplary embodiments, the parameter dependence can become affine.

Exemplary FIG. 4 may be a schematic diagram of Equations 15 and 16. The state Equation 15 involves the upper boxes in 402 while the measurement Equation 16 involves the lower boxes in 402. ΔT 422 is a time delay of one sample duration, with the right hand side of Equation 15 at 444 entering 422 and the left hand side, equal to state x_(t+1), leaving it. This is a recursion formula similar to equations 14 and 15, so the time index can be changed from “t” to “t+1” for the figure before the start of the next iteration and continuing until entering boxes 420, 430 and 410. Scheduling parameters ρ_(t) 406, inputs u_(t) 408 and outputs y_(t) 446 are variables. The upper four boxes are multiplication from left to right by B₀ 418, A₀ 420, B_(ρ) 414, A_(ρ) 416, respectively. Similarly, the lower boxes are multiplication with A replaced by C and B replaced by D, depicted as D_(ρ) 424, C_(ρ) 426, D₀ 428 and C₀ 430. In the feedback box 404, the Kronecker products involving ρ_(t) and successively x_(t) and u_(t) are formed in 410 and 412 respectively. The pairs of boxes aligned vertically are multiplying respectively from left to right the variables x_(t), u_(t), ρ_(t)⊗x_(t) and ρ_(t)⊗u_(t). Additionally, arrows shown as touching a wire can symbolize addition. Further, ΔT 422 can represent a time delay block of duration ΔT that can act similar to a date line for this exemplary embodiment. Therefore the arrows in exemplary FIG. 4 indicate a time flow or an actual sequence of operations; the flow may start at 406, 408 and the output of 422 and proceed through the diagram. Upon reaching ΔT 422, all operations for sample t have been performed and, upon crossing through ΔT 422, sample time t+1 may begin. Thus, for example, upon leaving ΔT 422, the same quantity is maintained, but all of the time labels can be changed to t+1 throughout the process shown in exemplary FIG. 4.

Further, as shown below, this can be equivalent to the LPV form shown in equations 15 and 16 where the state equations can be linear in the scheduling parameter vector ρ.

In further exemplary embodiments and now using a definition of feedback f_(t)=ρ_(t)⊗z_(t), as given by equations 26 and 27, the state equations for x_(t+1) and y_(t) may be as shown in equations 28 and 29 below.

$\begin{matrix} {x_{t + 1} = {{A_{0}x_{t}} + {B_{0}u_{t}} + {\sum\limits_{i = 1}^{s}{B_{f,i}{\rho_{t}(i)}C_{z,i}x_{t}}} + {\sum\limits_{i = 1}^{s}{B_{f,i}{\rho_{t}(i)}D_{z,i}u_{t}}}}} & \left( {{Equation}\mspace{14mu} 28} \right) \\ {y_{t} = {{C_{0}x_{t}} + {D_{0}u_{t}} + {\sum\limits_{i = 1}^{s}{D_{f,i}{\rho_{t}(i)}C_{z,i}x_{t}}} + {\sum\limits_{i = 1}^{s}{D_{f,i}{\rho_{t}(i)}D_{z,i}u_{t}}}}} & \left( {{Equation}\mspace{14mu} 29} \right) \end{matrix}$

Then, if equation 19 is used to define the above factors, equations 28 and 29 may be the same as equations 15 and 16.
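The equivalence between the factored feedback form and the direct LPV form can be checked numerically for a single scheduling parameter. The sketch below factors a hypothetical rank-deficient block P_(i) with NumPy's SVD in the manner of equation 19 and compares the two contributions to (x_(t+1); y_(t)). All dimensions and random values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
nx, nu, ny, r = 3, 2, 2, 2

# Hypothetical rank-r coefficient block P_i = [A_i B_i; C_i D_i]
P_i = rng.standard_normal((nx + ny, r)) @ rng.standard_normal((r, nx + nu))

# Factor P_i = [B_f,i; D_f,i] [C_z,i D_z,i] with a truncated SVD
U, sv, Vt = np.linalg.svd(P_i)
rank = int(np.sum(sv > 1e-10 * sv[0]))
left = U[:, :rank] * sv[:rank]   # [B_f,i; D_f,i]
right = Vt[:rank, :]             # [C_z,i  D_z,i]
B_f, D_f = left[:nx], left[nx:]
C_z, D_z = right[:, :nx], right[:, nx:]

x = rng.standard_normal(nx)
u = rng.standard_normal(nu)
rho_i = 0.7

# Feedback form: z = C_z x + D_z u, f = rho_i * z, entering through B_f, D_f
z = C_z @ x + D_z @ u
f = rho_i * z
feedback_form = np.concatenate([B_f @ f, D_f @ f])

# Direct LPV form: rho_i * [A_i B_i; C_i D_i] [x; u]
direct_form = rho_i * (P_i @ np.concatenate([x, u]))
```

The detected rank equals r, so the internal feedback signal z has only r components even though P_(i) has nx+nu columns.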

Next, in a further exemplary embodiment, as the rank of P_(i) may not be known, the outputs z_(t) may be set as the states and inputs, z_(t)=[x_(t); u_(t)], that may subsequently be fed back through the static nonlinearity so that f_(t)=[ρ_(t)⊗x_(t); ρ_(t)⊗u_(t)]. Then [C_(z,i) D_(z,i)]=[I_(dim x) 0; 0 I_(dim u)] and equations 30 and 31 may be set as below.

$\begin{matrix} {P_{i} = {\begin{bmatrix} A_{i} & B_{i} \\ C_{i} & D_{i} \end{bmatrix} = {\begin{bmatrix} B_{f,i} \\ D_{f,i} \end{bmatrix}\begin{bmatrix} C_{z,i} & D_{z,i} \end{bmatrix}}}} & \left( {{Equation}\mspace{14mu} 30} \right) \\ {= {{\begin{bmatrix} B_{f,i} \\ D_{f,i} \end{bmatrix}\begin{bmatrix} I_{\dim x} & 0 \\ 0 & I_{\dim u} \end{bmatrix}} = \begin{bmatrix} B_{f,i} \\ D_{f,i} \end{bmatrix}}} & \left( {{Equation}\mspace{14mu} 31} \right) \end{matrix}$

Thus, from the above, the LPV coefficient matrix P_(i)=[A_(i) B_(i); C_(i) D_(i)] can be the regression matrix of the left hand side state equation variables (x_(t+1); y_(t)) on the vector of nonlinear terms [ρ_(t)(i)x_(t); ρ_(t)(i)u_(t)]. More generally, the LPV coefficient matrix P_(ρ)=[A_(ρ) B_(ρ); C_(ρ) D_(ρ)]=[B_(f); D_(f)] can be the regression matrix of the left hand side state equation variables (x_(t+1); y_(t)) on the vector of nonlinear feedback terms (ρ_(t)⊗x_(t); ρ_(t)⊗u_(t)).

The LTI nonlinear feedback representation can remove a major barrier to applying existing subspace identification algorithms to the identification of LPV systems and can overcome previous problems with exponentially growing numbers of nonlinear terms used in other methods. For example, the above LTI nonlinear feedback representation can make it clear that nonlinear terms (ρ_(t)⊗x_(t); ρ_(t)⊗u_(t)) can be interpreted as inputs to an LTI nonlinear feedback system. Therefore it may be possible to directly estimate the matrices of the LTI system state space equations using linear subspace methods that can be accurate for processes with inputs and feedback. This can directly involve the use of the outputs y_(t) as well as augmented inputs [u_(t); (ρ_(t)⊗x_(t); ρ_(t)⊗u_(t))] of the LTI nonlinear feedback system.

In another exemplary embodiment, LTI system matrices and state vectors may be determined following the reduction of an LTI subsystem of a nonlinear feedback system involving known scheduling functions and the state of the LTI subsystem. This embodiment can involve the iterative determination of both the LTI system state and the LTI state space matrices describing the LTI system.

One example may be to consider the polynomial system as a linear system in x and u with additional additive input terms in the higher order product terms, so the additional inputs are ρ_(t)⊗x_(t) and ρ_(t)⊗u_(t). The scheduling variables ρ_(t) are assumed to be available in real time as operating points or measured variables. If accurate estimates of the state x_(t) are also available, then the problem would be only a direct application of the iterative algorithm for system identification. Since the variables x_(t) are not available until after the solution of the system identification, a different approach may be utilized.

Thus, in an exemplary first step, an initial estimate of the state vector may be made. Here, system identification may be performed on the terms in the state equations involving the variables x_(t), u_(t) and ρ_(t)⊗u_(t) but not the variables ρ_(t)⊗x_(t). From this, an approximation of the linear time invariant (LTI) part of the system may be obtained, giving estimates of A₀, B₀, C₀, D₀, B_(ρ), and D_(ρ) as well as estimates for the state vectors X_(1,N)^([1])=[x₁^(T) x₂^(T) . . . x_(N)^(T)]^(T).

Then, in an exemplary second step, an iterated estimate of the state vectors may be made. Here the state vector X_(1,N)^([1]) can be used as an initial estimate for x_(t) in the terms ρ_(t)⊗x_(t) in equations 15 and 16. Then the iterative algorithm can be applied to obtain an estimate of the system matrices A, B, C, and D and a refined estimate of X_(1,N)^([2]). Further, this step may then be iterated until a desired convergence is achieved.

The exemplary steps above may therefore work with only a few iterations. Thus, the iterative algorithm can be used in an exemplary manner to address the previously known problem of LPV system identification. The following is an exemplary discussion of using the iterative algorithm in directly identifying the coefficients F_(ij) and H_(ij) of the additive polynomial expansions of the nonlinear difference equation functions f(x_(t),u_(t),v_(t)) and h(x_(t),u_(t),v_(t)), respectively. This may be a very compact and parsimonious parameterization for such a nonlinear system. The iterative algorithm described herein for linear time-invariant systems can therefore be used with only a very modest increase in computational requirements. Further, this exemplary use of the iterative algorithm can directly treat additive nonlinear terms in the state equations involving the state vector, such as ρ_(t)⊗x_(t), as additional inputs. Since the state x_(t) may not be initially known, this may necessitate iteration to estimate the state sequence, starting with only the linear time invariant (LTI) part of the system, A₀, B₀, C₀ and D₀.

As seen in the above exemplary discussion following equations 13 and 14, a linear parameter varying system that can be affine in the scheduling variables ρ_(t) can be expressed in time invariant form involving the additional input variables ρ_(t)⊗x_(t) and ρ_(t)⊗u_(t). Note that this involves nonlinear functions of ρ_(t) with x_(t) and u_(t). The dynamic system can be linear time-invariant in these nonlinear functions of the variables.

In a further exemplary embodiment, the effect of additional inputs can be traced through the iterative algorithm. The exemplary steps outlined below may further be reflected in the table shown in exemplary FIG. 5 and the flow chart of exemplary FIG. 6.

Using elements 502 of exemplary FIG. 5 and in step 602 of exemplary FIG. 6, an ARX model may be fitted to the observations on iteration k with outputs y_(t) and inputs [u_(t)^(T) (ρ_(t)⊗x̂_(t)^([k−1]))^(T) (ρ_(t)⊗u_(t))^(T)]^(T), where for k=1 the term ρ_(t)⊗x̂_(t)^([k−1]) may be absent from the inputs, which can be equivalent to removing the terms involving (A_(ρ), C_(ρ)) from the LPV model structure. The ARX model fitting can be a linear regression problem that makes no prior assumptions on the ARX order other than the maximum ARX order considered. If the identified ARX order is near the maximum considered, the maximum ARX order considered can be doubled and the regression recomputed.
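As an illustration of the linear regression character of step 602, the following Python sketch fits an ARX model of a fixed order by ordinary least squares. The function `fit_arx`, the regressor layout and the noise-free scalar test system are assumptions for illustration; order selection and the augmented inputs are not included.

```python
import numpy as np

def fit_arx(y, u, lag):
    """Least-squares ARX fit. y: (T, ny), u: (T, nu).
    Regresses y_t on [y_{t-1..t-lag}, u_{t..t-lag}] and returns the
    coefficient matrix with one row per output."""
    T = len(y)
    rows = []
    for t in range(lag, T):
        past_y = y[t - lag:t][::-1].ravel()      # y_{t-1}, ..., y_{t-lag}
        past_u = u[t - lag:t + 1][::-1].ravel()  # u_t, ..., u_{t-lag}
        rows.append(np.concatenate([past_y, past_u]))
    Phi = np.array(rows)
    coef, *_ = np.linalg.lstsq(Phi, y[lag:], rcond=None)
    return coef.T

# Hypothetical noise-free test system: y_t = 0.5 y_{t-1} + u_t + 2 u_{t-1}
rng = np.random.default_rng(0)
T = 200
u = rng.standard_normal((T, 1))
y = np.zeros((T, 1))
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 1.0 * u[t] + 2.0 * u[t - 1]

coef = fit_arx(y, u, lag=1)  # columns: y_{t-1}, u_t, u_{t-1}
```

In the setting described above, `u` would be replaced by the augmented input for iteration k, and the fit would be repeated over candidate orders up to the maximum considered.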

In exemplary step 604 a corrected future can be computed. The effect of future inputs on future outputs can be determined using the ARX model and subtracted from the outputs. The effect of this can be to compute the future outputs that could be obtained from the state at time t if there were no inputs in the future of time t.
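
One possible reading of step 604 is sketched below in Python: the ARX model is driven by the future inputs alone, starting from a zero past, and that response is subtracted from the observed future outputs. The function name corrected_future, the coefficient layout (lists A and B of lag matrices) and the toy scalar system are assumptions of this sketch.

```python
import numpy as np

def corrected_future(y, u, A, B, t, f):
    """Subtract from y[t:t+f] the ARX response to the inputs u[t:t+f] alone.

    A: list of (dimy, dimy) matrices for output lags 1..p (assumed layout).
    B: list of (dimy, dimu) matrices for input lags 0..p (assumed layout).
    """
    dimy = y.shape[1]
    resp = np.zeros((f, dimy))  # response to future inputs with a zero past
    for s in range(f):
        acc = np.zeros(dimy)
        for i, Ai in enumerate(A, start=1):
            if s - i >= 0:
                acc += Ai @ resp[s - i]
        for i, Bi in enumerate(B):
            if s - i >= 0:
                acc += Bi @ u[t + s - i]
        resp[s] = acc
    return y[t:t + f] - resp

# Toy noise-free scalar ARX system started from a zero past
rng = np.random.default_rng(0)
A1, B0, B1 = np.array([[0.5]]), np.array([[1.0]]), np.array([[0.3]])
u = rng.standard_normal((20, 1))
y = np.zeros((20, 1))
for s in range(20):
    y[s] = B0 @ u[s]
    if s >= 1:
        y[s] += A1 @ y[s - 1] + B1 @ u[s - 1]

# With t=0 there is no past, so the corrected future should vanish.
cf0 = corrected_future(y, u, [A1], [B0, B1], t=0, f=20)
```

For a later time t, the first corrected output in this toy system reduces to the contribution of the past alone, A1·y_(t−1)+B1·u_(t−1), which is the quantity the subsequent CVA step relates to the past.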

In exemplary step 606, a canonical variate analysis (CVA) can be made or computed. Here, the CVA between the past and the corrected future can be computed. Again, the covariance matrices among and between the past and corrected future may also be computed. This may be similar to an SVD on the joint past-future covariance matrix, which is of the same order as the covariance of the past used to obtain the ARX model. A result of this step is to obtain estimates of the states of the system, called ‘memory’.
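
A minimal Python sketch of a CVA computation of this general kind follows. It whitens the past and the corrected future by their own covariances, takes an SVD of the whitened cross-covariance, and forms the ‘memory’ from the leading canonical variates of the past. The function names inv_sqrt and cva_memory, the eigenvalue floor eps, and the choice of the future covariance as the weighting are assumptions of this sketch.

```python
import numpy as np

def inv_sqrt(S, eps=1e-10):
    """Symmetric inverse square root of a covariance matrix."""
    w, V = np.linalg.eigh(S)
    w = np.maximum(w, eps)  # guard against tiny or negative eigenvalues
    return V @ np.diag(w ** -0.5) @ V.T

def cva_memory(P, F, order):
    """P: (N, dp) past vectors, F: (N, df) corrected futures, one row per time."""
    P = P - P.mean(axis=0)
    F = F - F.mean(axis=0)
    n = len(P)
    Spp, Sff, Spf = P.T @ P / n, F.T @ F / n, P.T @ F / n
    Wp, Wf = inv_sqrt(Spp), inv_sqrt(Sff)
    # Singular values are the canonical correlations between past and future.
    U, s, _ = np.linalg.svd(Wp @ Spf @ Wf)
    J = U[:, :order].T @ Wp  # maps a past vector to the memory
    return P @ J.T, s
```

With this normalization the memory has an identity sample covariance, and the decay of the singular values s gives one indication of how many states carry predictive information.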

In exemplary step 608, a regression using the state equation may be performed. The ‘memory’ from step 606 can be used in the state equations as if it were data and resulting estimates of the state space matrices and covariance matrices of the noise processes can be obtained. These estimates can be asymptotically ML estimates of the state space model with no prior assumptions on the parameter values of the ARX or SS model.
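
The regression of step 608 can be sketched, under stated assumptions, as a single least-squares problem in which the stacked vector [m_(t+1); y_(t)] is regressed on [m_(t); u_(t)]. The function name fit_state_space and the array layout are hypothetical; the residual covariance is returned as a simple estimate of the joint noise covariance.

```python
import numpy as np

def fit_state_space(m, u, y):
    """Treat the memory m as data and regress [m_{t+1}; y_t] on [m_t; u_t].

    m: (N, n) state estimates, u: (N, dimu) inputs, y: (N, dimy) outputs.
    """
    Z = np.hstack([m[:-1], u[:-1]])   # regressors [m_t, u_t]
    Xi = np.hstack([m[1:], y[:-1]])   # targets [m_{t+1}, y_t]
    Theta, *_ = np.linalg.lstsq(Z, Xi, rcond=None)
    Theta = Theta.T                   # rows: targets, columns: regressors
    n = m.shape[1]
    A, B = Theta[:n, :n], Theta[:n, n:]
    C, D = Theta[n:, :n], Theta[n:, n:]
    resid = Xi - Z @ Theta.T
    noise_cov = resid.T @ resid / len(Z)
    return A, B, C, D, noise_cov
```

When the inputs u are the augmented inputs of step 602, the estimated B and D in such a sketch would contain the blocks multiplying the Kronecker product terms.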

Thus, using the above-described methodology, the ML solution in the iterative algorithm can be obtained in one computation based on the assumed outputs and inputs in iteration k, as shown in 504 of exemplary FIG. 5, with no iteration on assumed parameter values. The iteration is the result of refinement of the state estimate in the nonlinear term (ρ_(t){circle around (x)}{circumflex over (x)}_(t) ^([k−1]))^(T) that can be part of the assumed data in the iteration k (504).

Referring back to step 602, the ARX model fitting, the dimension of the augmented inputs is increased from dimu to dimua=dimu+dimp(dimx+dimu). To fit the ARX model, the ARX order lagp identified can be substantially higher due to the nonlinear input terms and depending on the statistically significant dynamics present among the output and augmented input variables. The computation can involve computation of an SVD on the data covariance matrix that is of dimension dimua*lagp, where lagp is the maximum ARX order considered. The computation that may be utilized for the SVD is of the order of 60*(dimua*lagp)³, so the computation increases in proportion to (dimp(dimx+dimu)/dimu)³.
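
The arithmetic can be made concrete with a short Python sketch using assumed example dimensions (dimu=2, dimx=4, dimp=3 and lagp=10, chosen only for illustration). The helper names augmented_dim and svd_cost are hypothetical.

```python
def augmented_dim(dimu, dimx, dimp):
    """dimua = dimu + dimp * (dimx + dimu), as in the text."""
    return dimu + dimp * (dimx + dimu)

def svd_cost(dim_inputs, lagp):
    """Operation count of the order 60 * (dim * lagp)^3 quoted in the text."""
    return 60 * (dim_inputs * lagp) ** 3

dimu, dimx, dimp, lagp = 2, 4, 3, 10          # assumed example values
dimua = augmented_dim(dimu, dimx, dimp)       # 2 + 3 * (4 + 2) = 20
ratio = svd_cost(dimua, lagp) / svd_cost(dimu, lagp)
print(dimua, ratio)                           # 20 1000.0
```

In this example the input dimension grows by a factor of ten and the SVD cost by a factor of one thousand. The exact ratio is (dimua/dimu)³, which equals (1+dimp(dimx+dimu)/dimu)³; this is slightly larger than the factor stated above, and the growth remains polynomial rather than exponential in either case.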

Therefore, one consequence of augmenting the system inputs by the nonlinear terms ρ_(t){circle around (x)}x_(t) and ρ_(t){circle around (x)}u_(t) may be to increase the dimension of the past by a factor dimp(dimx+dimu)/dimu, and to increase the computation by this factor cubed. Depending on the particular dimensions of u_(t), x_(t), and ρ_(t), this can be very significant; however, there is no exponential explosion in the number of terms or the computation time. The LPV subspace algorithm of this invention still corresponds to subspace system identification for a linear time-invariant system. In addition, because the nonlinear terms [u_(t) ^(T)(ρ_(t){circle around (x)}x_(t))^(T) (ρ_(t){circle around (x)}u_(t))^(T)]^(T) involve the state estimates {circumflex over (x)}_(t), iteration on the estimate of the system states until convergence can be desired or, in some alternatives, required.

Another factor to be considered is that the result of the CVA in exemplary step 606 is the computation of an estimate, denoted as m_(t), of the state sequence x_(t). The symbol ‘m’ is used as in the word ‘memory’ to distinguish it from the actual state vector and various estimates of the state vector as used in the EM algorithm discussed below. The estimate {circumflex over (m)}_(t) in combination with the maximization step of the EM algorithm can be shown to be a maximum likelihood solution of the system identification problem for the case of a linear time-invariant system. In that case, the global ML solution can be obtained in only one iteration of the algorithm. This may be different with the EM and gradient search methods that almost always utilize many iterations. The CVA estimate m_(t) may actually be an estimate of the state sequence for the system with parameters equal to the maximum likelihood estimates in the case of LTI systems and large sample size. This is different from the usual concept of first obtaining the ML parameter estimates and then estimating the states using a Kalman filter. Not only is the optimal state sequence for the ML parameters obtained in a single iteration, the optimal state order may also be determined in the process. In the usual ML approach, it can be desired or, in some alternatives, required to obtain the ML estimates for each choice of state order and then proceed to hypothesis testing to determine the optimal state order.

In another exemplary embodiment, the convergence of the iterative algorithm for the case of LPV may be described. In this embodiment, a substantially similar approach may be taken for other forms of nonlinear difference equations.

For any iterative algorithm, two issues may typically arise: (1) does the algorithm always converge, and (2) at what rate does it converge. In the computational examples considered, the iteration k on the estimated states {circumflex over (x)}_(t) ^([k]) can be very rapid, for example half a dozen steps. Thus it may be shown herein that the iterative algorithm can be closely related to the class of EM algorithms that can be shown to always converge, under an assumption on the LPV system stability. Also, the rate of convergence can be computed to be geometric. This latter result is unique since the EM algorithm typically makes rapid early progress but becomes quite slow in the end. The reason for the rapid terminal convergence of the LPV algorithm will be discussed in further detail. Issues of initialization, stability and convergence will be elaborated below.

Additionally, the methods and systems described herein may be discussed in the context of the EM algorithm as there can be some parallelism between the two. To show the convergence of the LPV algorithm, the development in Gibson and Ninness (2005), denoted as GN below and incorporated by reference herein in its entirety (S. Gibson and B. Ninness, “Robust maximum-likelihood estimation of multivariable dynamic systems,” Automatica, vol. 41, no. 5, pp.1667-1682, 2005) will be discussed with various modifications made that are appropriate for the LPV algorithm. All equation numbers from Gibson and Ninness (2005) will include GN following the number in the paper GN, for example (23GN).

In this exemplary embodiment, the replacements that can be made in the GN discussion to obtain the LPV algorithm may be as follows: replace the LTI state equations with the LPV state equations and, for the missing data, replace the state estimate from the Kalman smoother with the ‘memory’ vector m_(t) in the iterative algorithm. The consequence of this can be significant because for linear systems as in GN the iterative algorithm described herein can obtain the global ML parameter estimates in one step in large samples. On the other hand, for linear systems it may take the EM algorithm many iterations to obtain the ML solution.

Replacing the LTI model with the LPV model replaces the state space model equation 11GN by the LPV equations (15) and (16). This can produce a number of modifications in the equations of Lemma 3.1, since u_(t) and (A, B, C, D) in GN are replaced by [u_(t) ^(T)(ρ_(t){circle around (x)}x_(t))^(T) (ρ_(t){circle around (x)}u_(t))^(T)]^(T) and (A₀, [B₀A_(ρ)B_(ρ)],C₀, [D₀C_(ρ)D_(ρ)]), respectively. So the ‘data’ includes the vector ρ_(t){circle around (x)}x_(t), where the state vector x_(t) is not available.

To execute iteration k as shown in 504 of exemplary FIG. 5, a straightforward approach involving the EM algorithm could be used. The ‘missing data’ would then be the state vector x_(t). (See S. Gibson, A. Wills, and B. Ninness (2005), “Maximum likelihood parameter estimation of bilinear systems”, IEEE Trans. Automatic Control, Vol. 50, No. 10, pp. 1581-2005, the contents of which are hereby incorporated by reference in their entirety). Then we could proceed as for the case of a bilinear system that involves a term u_(t){circle around (x)}x_(t), which may be completely analogous to ρ_(t){circle around (x)}x_(t) for the LPV algorithm. But such an EM approach can, on occasion, result in the typical slow convergence behavior of EM algorithms near the maximum.

Therefore, instead the LPV algorithm may be used and can result in a rapid convergence. Thus, in this exemplary embodiment, instead of specifying the ‘missing data’ as the estimate of the Kalman smoother, the subspace approach can specify the CVA state estimate m_(t) or ‘memory’, as the missing data. The memory m_(t) is the estimate of the state vector obtained by a canonical variate analysis between the corrected future and the past obtained in exemplary step 606 of the iterative algorithm using the input and output vectors specified in FIG. 5. This may be similar to a Kalman filter state estimate at the global ML parameter estimates associated with the output and input data at iteration k rather than a Kalman smoother state estimate at the last estimated parameter value. One difference can occur because the CVA method expresses the likelihood function in terms of the corrected future conditioned on the past so that estimates of memory m_(t) may depend only on the output and input data and their distribution depends on the ML estimates associated with the output and input data used in iteration k rather than smoothed estimates of the state. The actual conditional likelihood function p_(e)(ξ_(t)/z_(t)) of (26GN) with ξ_(t) ^(T)=[x_(t+1) ^(T)y_(t) ^(T)] and z_(t) ^(T)=[x_(t) ^(T)u_(t) ^(T)] is what can be involved in all of the EM computations. This can be the same likelihood function involved in the exemplary step 608 of the CVA algorithm of estimating the parameters of the state space equation as in Lemma 3.3 of GN. A difference is that in the exemplary step 608 of the CVA algorithm the expectation can be with respect to the true global ML estimates associated with the output and input data at iteration k whereas the GN estimate is an expectation with respect to the parameter value obtained in the previous iteration.

Further, the use of the LPV model and the choice of memory m_(t) as the missing data in the expectation step of the algorithm can have the following consequence in GN. The basic theory in section 2, The Expectation-Maximization (EM) Algorithm of GN, needs no modification. In Section 3 of GN, the missing data is taken to be the CVA state estimates {circumflex over (m)}_(t) ^([k]) based on the input and output quantities for iteration k in 504 of FIG. 5. So the x_(t) in equations (22GN) through (28GN) can be replaced by {circumflex over (m)}_(t) ^([k]).

Thus, Lemma 3.1 of GN holds but also can achieve the global ML estimate associated with the input-output vectors of exemplary FIG. 5 in one step. Lemma 3.2 of GN can be replaced by the iterative algorithm to obtain the memory estimates {circumflex over (m)}_(t) ^([k]). Lemma 3.3 of GN is the same result as obtained in the iterative algorithm. An additional step may be used to compute {circumflex over (x)}_(t) ^([k]) from the parameter estimates {circumflex over (θ)}^([k]) and the linear time-varying state equations given by the LPV state equations. This step may be desired to obtain the state estimate {circumflex over (x)}_(t) ^([k]) for starting the next iteration k+1. This step can further allow for the iterative algorithm to produce Q(θ,θ^(l))≧Q(θ^(l),θ^(l)) with equality if and only if {circumflex over (x)}_(t) ^([k])={circumflex over (x)}_(t) ^([k+1]).

Then, in principle, the memory {circumflex over (m)}_(t) ^([k]) could be used in place of the state estimate {circumflex over (x)}_(t) ^([k]). In fact, {circumflex over (m)}_(t) ^([k]) projected on the recursive structure of the state equations in equations (43GN) and (44GN) can produce the ML state estimates asymptotically and the corresponding optimal filtered state estimates {circumflex over (x)}_(t) ^([k]). Thus the use of {circumflex over (m)}_(t) ^([k]) instead of {circumflex over (x)}_(t) ^([k]) in the computation of ρ_(t){circle around (x)}{circumflex over (x)}_(t) ^([k]) as part of the ‘input data’ can lead to essentially the same result except for some ‘noise’ in the results. But the computational noise can be avoided by the small additional computation of {circumflex over (x)}_(t) ^([k]).

Also, in some exemplary embodiments, there may not be a need for a ‘robust’ improvement to the iterative algorithm described herein since it has been developed using primarily singular value decomposition computations to be robust and demonstrated as such for more than a decade. An exception is possibly the computation of the filtered state estimate {circumflex over (x)}_(t) ^([k]) that could be implemented using the square root methods of Bierman (G. J. Bierman, Factorization Methods for Discrete Sequential Estimation, Academic Press (1977); republished by Dover, New York (2006), the contents of which are hereby incorporated by reference in their entirety) if ill-conditioned LPV dynamic systems are to be solved to high precision.

In still another exemplary embodiment, it may be demonstrated that the LPV system identification algorithm may converge at a geometric rate near the maximum likelihood solution. Here the result for a linear system can be developed. The same approach may work for an LPV system, but the expressions below may be time dependent and can be of greater complexity.

Further, in order to simplify the derivation herein, the time invariant feedback from equations 17 and 18 can be considered with the substitution of notation (Ã,{tilde over (B)},{tilde over (C)},{tilde over (D)},ũ_(t),u_(t)) by (A,B,C,D,u_(t),ũ_(t)). Thus, equation 17 and equation 18, with noise v_(t) in innovation form may be written as equations 32 and 33 below.

x _(t+1) =Ax _(t) +Bu _(t) +Kv _(t)   (32)

y _(t) =Cx _(t) +Du _(t) +v _(t)   (33)

Thus, from the above, it may be seen that for measurements of outputs y_(t) and inputs u_(t)=[ũ_(t) ^(T)(ρ_(t){circle around (x)}x_(t))^(T)(ρ_(t){circle around (x)}ũ_(t))^(T)]^(T), the time-invariant matrices are (A, B, C, D)=(A₀,[B₀A_(ρ)B_(ρ)],C₀,[D₀C_(ρ)D_(ρ)]), respectively. Then, solving for v_(t) in equation 33 and substituting in equation 32 produces equation 34 below.

x _(t+1) =Ax _(t) +Bu _(t) +K(y _(t) −Cx _(t) −Du _(t))=(A−KC)x _(t)+(B−KD)u _(t) +Ky _(t)   (34)

Next, recursively substituting the right hand side of equation 34 for x_(t) can provide equation 35 below.

$\begin{aligned} x_{t} &= \sum_{i=1}^{\infty} \left( A - KC \right)^{i-1} \left[ \left( B - KD \right) u_{t-i} + K y_{t-i} \right] \\ &= J p_{t}\left( y_{t}, \left[ \tilde{u}_{t}, \left( \rho_{t} \otimes x_{t} \right), \left( \rho_{t} \otimes \tilde{u}_{t} \right) \right] \right) \end{aligned} \qquad \text{(Equation 35)}$

In equation 35, J can contain the ARX coefficients and p_(t) can be the past output y_(t) and inputs u_(t)=[ũ_(t),(ρ_(t){circle around (x)}x_(t)),(ρ_(t){circle around (x)}ũ_(t))]. Here ũ_(t) can be the original inputs such that u_(t) can include the nonlinear Kronecker product terms.
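
As a hedged numerical check of the predictor form in equations 34 and 35, the Python sketch below simulates a small innovation-form system and then reconstructs the state from a truncated sum over past inputs and outputs. The matrices, the noise level and the truncation length lag are assumed toy values, and the input here is a generic random sequence rather than one containing Kronecker product terms.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.7, 0.2], [0.0, 0.5]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.1]])
K = np.array([[0.3], [0.1]])

N = 200
u = rng.standard_normal((N, 1))
v = 0.1 * rng.standard_normal((N, 1))
x = np.zeros((N + 1, 2))
y = np.zeros((N, 1))
for t in range(N):
    y[t] = C @ x[t] + D @ u[t] + v[t]          # equation 33
    x[t + 1] = A @ x[t] + B @ u[t] + K @ v[t]  # equation 32

# Equation 35, truncated after `lag` terms; A - KC is stable for these values.
Abar, Bbar = A - K @ C, B - K @ D
lag, t = 60, N
x_rec = sum(np.linalg.matrix_power(Abar, i - 1) @ (Bbar @ u[t - i] + K @ y[t - i])
            for i in range(1, lag + 1))
print(np.allclose(x_rec, x[N], atol=1e-8))
```

The truncated sum reproduces the simulated state because the neglected terms are scaled by powers of A−KC, which decay quickly when its eigenvalues lie well inside the unit circle.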

In a further exemplary embodiment, asymptotically for a large sample, J can be arbitrarily close to constant for a sufficiently large iteration k such that

$\begin{matrix} {{\Delta \; x_{t}^{\lbrack k\rbrack}} = {x_{t}^{\lbrack{k + 1}\rbrack} - \; x_{t}^{\lbrack k\rbrack}}} \\ {= {J\; {p_{t}\left\lbrack {0,{0\left( {\rho_{r} \otimes \left( \; {x_{t}^{\lbrack k\rbrack} - \; x_{t}^{\lbrack{k - 1}\rbrack}} \right)} \right)0}} \right\rbrack}}} \\ {= {L\begin{bmatrix} {{\rho_{t - 1} \otimes \Delta}\; x_{t - 1}^{\lbrack{k - 1}\rbrack}} \\ \vdots \\ {{\rho_{t - {lag}} \otimes \Delta}\; x_{t - {lag}}^{\lbrack{k - 1}\rbrack}} \end{bmatrix}}} \\ {{= {L_{t}\begin{bmatrix} {\Delta \; x_{t - 1}^{\lbrack{k - 1}\rbrack}} \\ \vdots \\ {\Delta \; x_{t - {lag}}^{\lbrack{k - 1}\rbrack}} \end{bmatrix}}},} \end{matrix}$

where L_(t) can be time varying and can have the time varying scheduling parameters ρ_(t) combined with terms L. Also, if ρ_(t) for all t is bounded, L_(t) will be similarly bounded.

Now, when writing the state sequence in block vector form as X_(1:N) ^(i)=vec[x_(N) ^(i) . . . x₁ ^(i)], where the vec operation can stack the columns of a matrix starting with left hand columns on top, the above result can imply that

$\begin{bmatrix} \Delta x_{N}^{[k]} \\ \vdots \\ \Delta x_{t-k}^{[k]} \\ \vdots \end{bmatrix} = \begin{bmatrix} 0 & L_{N-1} & & \\ 0 & \ddots & \ldots & \ldots \\ 0 & 0 & L_{N-k} & \rightarrow \\ 0 & 0 & 0 & \ddots \end{bmatrix} \begin{bmatrix} \Delta x_{N-1}^{[k-1]} \\ \vdots \\ \Delta x_{N-1}^{[k-1]} \\ \vdots \end{bmatrix},$

where → can mean that L_(N−k) extends to the right. Using M to denote the upper triangular matrix can give the fundamental expression for the difference ΔX_(lag:N) ^(i) between state sequences X_(lag:N) ^(i) at successive iterations k and k−1 as equation 36 below.

ΔX _(lag+2:N) ^(i) =MΔX _(lag+1:N−1) ^(i−1)   (36)

From equation 36 it may then be seen that the terminal convergence rate of the iteration in the LPV case can be governed by M, in particular the largest singular value of M. The various blocks of M can thus be computed by

${\left( {A - {KC}} \right)^{i - 1}{\left( {B - {KD}} \right)_{\rho \otimes x}\begin{bmatrix} {{\rho_{N - 1}(1)}I_{\dim \; x}} \\ \vdots \\ {{\rho_{N - i}\left( {\dim \; p} \right)}I_{\dim \; x}} \end{bmatrix}}},$

where the subscript ρ{circle around (x)}x means to select the submatrix of B−KD whose columns correspond to the rows of ρ{circle around (x)}x in ũ_(t).
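
The role of the largest singular value of M in equation 36 can be illustrated with a small Python sketch. The matrix M below is a randomly generated strictly upper triangular stand-in, scaled so that its largest singular value is 0.5; it is an assumption of this sketch and is not computed from any identified system.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for M: strictly upper triangular, scaled to largest singular value 0.5
M = np.triu(rng.standard_normal((6, 6)), k=1)
M *= 0.5 / np.linalg.norm(M, 2)
sigma_max = np.linalg.norm(M, 2)

# Iterate Delta^[k] = M Delta^[k-1] and record the norm of the difference.
delta = rng.standard_normal(6)
norms = [np.linalg.norm(delta)]
for _ in range(5):
    delta = M @ delta
    norms.append(np.linalg.norm(delta))

# Each step shrinks the difference by at least the factor sigma_max.
print(all(b <= sigma_max * a + 1e-12 for a, b in zip(norms, norms[1:])))
```

Since the norm of M·Δ can never exceed the largest singular value of M times the norm of Δ, a value below one gives at least geometric decay of the difference between successive state sequences.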

In some other exemplary embodiments, the convergence of the iterative linear subspace computation may be affected by the stability of the LPV difference equations and, more specifically, the stability of the LPV linear subspace system identification described herein. A set of time-invariant linear state space difference equations may be stable if and only if all of the eigenvalues of the state transition matrix are stable, that is, the eigenvalues have magnitude less than 1. The LPV case is more complex, but for the purposes of this exemplary embodiment, the rate of growth or contraction per sample time can be given for each eigenvector component of the state vector x_(t) by the respective eigenvalues of the LPV state transition matrix from equation 9 and now shown as equation 37 below.

$\begin{matrix} {{A\left( \rho_{t} \right)} = {A_{0} + {\sum\limits_{i = 1}^{s}{{\rho_{t}(i)}A_{i}}}}} & \left( {{Equation}\mspace{14mu} 37} \right) \end{matrix}$

In equation 37, the matrices A=(A₀ A_(ρ))=(A₀ A₁ . . . A_(s)) can be assumed to be unknown constant matrices. Therefore, it is apparent that the transition matrix A(ρ_(t)) may be a linear combination [1; ρ_(t)] of the matrices A_(i) for 0≦i≦s. Therefore, for any choice of the matrices A_(i) for 0≦i≦s, there can be possible values of ρ_(t) that could produce unstable eigenvalues at particular sample times t. If or when this occurs only sporadically and/or for only a limited number of consecutive observation times, no problems or errors may arise.
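
A minimal Python sketch of equation 37 and of the eigenvalue check follows. The matrices A0 and A1 and the two scheduling values are assumed toy values, and the helper names lpv_transition and spectral_radius are hypothetical.

```python
import numpy as np

def lpv_transition(A_list, rho_t):
    """A(rho_t) = A_0 + sum_i rho_t(i) A_i, with A_list = [A_0, A_1, ..., A_s]."""
    return A_list[0] + sum(r * Ai for r, Ai in zip(rho_t, A_list[1:]))

def spectral_radius(A):
    """Largest eigenvalue magnitude; values below 1 indicate contraction."""
    return np.max(np.abs(np.linalg.eigvals(A)))

A0 = np.array([[0.5, 0.1], [0.0, 0.6]])
A1 = np.array([[0.4, 0.0], [0.0, 0.3]])

r_small = spectral_radius(lpv_transition([A0, A1], [0.5]))  # 0.75, stable
r_large = spectral_radius(lpv_transition([A0, A1], [2.0]))  # 1.3, unstable
print(r_small, r_large)
```

The same pair of matrices thus gives a stable transition matrix at one scheduling value and an unstable one at another, which is the situation described above.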

In some other exemplary embodiments, for example in the k-th iterative computation of the subspace algorithm, only estimated values of the state sequence {circumflex over (x)}_(t) ^([k]) and the matrices Â_(i) ^([k]) for 0≦i≦s may be used. Therefore, if large errors in the estimates of these quantities are possible or acceptable, then there can be a greater potential for unstable behavior. For example, there may be areas of application of more significant importance, such as the identification of aircraft wing flutter dynamics, where the flutter dynamics may be marginally stable or even unstable, with the vibration being stabilized by active control feedback from sensors to wing control surfaces. In some other applications, it may be possible to guarantee that no combination of scheduling parameters ρ_(t), values of matrices A_(i) and uncertainty could produce instabilities. Further to this, the stability of the transition matrix can provide a potential for issues and therefore further consideration could be desired.

In some further exemplary embodiments, instabilities can produce periods where the predicted system response can rapidly grow very large. For example, when the eigenvalues are all bounded in magnitude below 1, the predicted system response can be bounded, whereas if estimation errors are large, for example if one of the eigenvalues is equal to 2 for a period of time, then the predicted response may double approximately every sample during that time. Since 10³⁰=(10³)¹⁰≅(2¹⁰)¹⁰=2¹⁰⁰, in as few as 100 samples, 30 digits of precision could be lost in the computation, which could then provide meaningless results.
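
The arithmetic of this example can be checked with a few lines of Python; the variable names are illustrative only.

```python
import math

samples = 100
growth = 2.0 ** samples            # response doubling at every sample
digits_lost = math.log10(growth)   # decimal digits consumed by the growth

print(f"{growth:.3e}", round(digits_lost, 1))  # 1.268e+30 30.1
```

A double precision number carries roughly 16 significant decimal digits, so growth of this size would exceed the available precision well before 100 samples.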

Therefore, per the above, there may be conditions under which there may be considerable loss of numerical accuracy that can be associated with periods where the LPV transition matrix is unstable, for example over extended intervals of time. Further, the difficulty can lie in the algorithm initialization and sample size, since at the optimal solution with a large sample, the algorithm is stable and convergent. If it were possible to compute with infinite precision, then problems with ill-conditioning could be avoided; however, with 15 or 30 decimal place accuracy, for example, some real data sets, such as for the aircraft wing flutter, can benefit from further consideration.

Therefore, some exemplary embodiments may deal with manners of correcting for or otherwise lessening any undesired effects that may result from algorithm instability. For example, if the state sequence {circumflex over (x)}_(t) ^([k]) is sufficiently close to the optimum as based upon the terminal convergence results described previously, the iterative algorithm may be stable provided it is assumed that the LPV system is stable. Conversely, in some examples, large initial errors in the estimate {circumflex over (x)}_(t) ^([1]) can lead to an unstable computation.

In other examples, such as during time intervals when the scheduling parameter values cause significant instabilities in A(ρ_(t)), some components of f_(t)|q_(t), the future outputs f_(t) corrected for the effects of future inputs q_(t), may become quite large. Therefore, in the computation of the covariance matrix of f_(t)|q_(t), such large values can cause considerable ill-conditioning and loss in numerical precision.

In examples where the iterative algorithm and some other subspace methods permit the arbitrary editing of the times where the instabilities occur, outlier editing of unstable regions may be performed. For example, in the ARX covariance computation where the corrected future can be computed, the time intervals with significant instabilities can be determined and removed from the computation.
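
One hypothetical way to flag time intervals for such editing is sketched below in Python. The criterion used here, namely the spectral radius of the estimated transition matrix A(ρ_(t)) at each sample compared with a threshold, is an assumption of this sketch; the description above does not specify how the unstable intervals are determined.

```python
import numpy as np

def stable_mask(A_list, rho, threshold=1.0):
    """Return a boolean array that is True where A(rho_t) is judged stable.

    A_list = [A_0, A_1, ..., A_s] (estimated matrices), rho: (N, s) scheduling
    values. Samples flagged False would be left out of covariance computations.
    """
    keep = np.empty(len(rho), dtype=bool)
    for t, rho_t in enumerate(rho):
        A_t = A_list[0] + sum(r * Ai for r, Ai in zip(rho_t, A_list[1:]))
        keep[t] = np.max(np.abs(np.linalg.eigvals(A_t))) < threshold
    return keep

A0 = np.array([[0.5, 0.1], [0.0, 0.6]])
A1 = np.array([[0.4, 0.0], [0.0, 0.3]])
rho = np.array([[0.5], [2.0], [0.1]])
keep = stable_mask([A0, A1], rho)
print(keep)  # [ True False  True]
```

In use, rows of the past and corrected future matrices could be selected with keep before the covariance matrices are formed, and the mask recomputed at each iteration as the estimates improve.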

In examples dealing with experimental design, if the trajectory of the operating point variables can be specified, then the scheduling parameters can be scheduled to enhance the system identification in several ways. An initial region that can avoid computational instabilities can be chosen to obtain sufficiently accurate initial estimates of the LPV parameters (A, B, C, D). This can then be used in other regions to initialize the state estimate of the algorithm with sufficient accuracy that computational instabilities will not be encountered.

In the above examples, the removal of unstable outliers at each iteration can be the most general and robust procedure. As the estimated values of the state sequence {circumflex over (x)}_(t) ^([k]) and LPV parameters improve with more iterations, the number of outliers can be expected to decrease until there is rapid terminal convergence. A counterexample to this expectation is when the beginning of the time history of the scheduling parameters ρ_(t) has little variation, so that the LPV model is good for that portion of the data but is a poor global model. Then, in the later part of the time history, there can be considerable variation in ρ_(t) such that unstable behavior may result.

Additionally, in many potential exemplary applications of the methods and systems described herein, it can be expected that the proposed algorithm can perform much better than existing methods that presently are not feasible on industrial problems. Further, in many situations, it can be desired to design the experiment to obtain results of a desired fidelity for a specified global region of the operating space at as little cost in time and resources as possible. Because the iterative algorithm's linear subspace method is a maximum likelihood based procedure, designs can be developed for LTI system identification. Also, as it identifies a stochastic model with estimated disturbance models, including confidence bands on quantities such as dynamic frequency response functions, the required sample size and system input excitation can be developed with little prior information on the disturbance processes.

In some further exemplary embodiments, the LPV methods and systems described herein may be extended to nonlinear systems. For example, it may be shown that a number of complex and nonlinear systems can be expressed in an approximate LPV form that can be sufficient for application of the LPV subspace system identification methods described herein.

In one example, a general nonlinear, time varying, parameter varying dynamic system can be described by a system of nonlinear state equations, such as those shown in equations 38 and 39.

x _(t+1) =f(x _(t) ,u _(t),ρ_(t) ,v _(t))   (38)

y _(t) =h(x _(t) ,u _(t),ρ_(t) ,v _(t))   (39)

In equations 38 and 39, x_(t) can be the state vector, u_(t) can be the input vector, y_(t) can be the output vector and v_(t) can be a white noise measurement vector. In some exemplary embodiments, to deal with ‘parameter varying’ systems, the ‘scheduling’ variables ρ_(t), which can be time-varying parameters, can describe the present operating point of the system. Very general classes of functions f(·) and h(·) can be represented by additive Borel functions that need not be continuous.

In a simplified manner, the case of functions admitting Taylor expansion as in Rugh Section 6.3 (W. J. Rugh, Nonlinear System Theory: The Volterra/Wiener Approach. Baltimore, Md.: Johns Hopkins Univ. Press, 1981, the contents of which are hereby incorporated by reference in their entirety), where ρ_(t) and v_(t) may be absent can provide the following, as shown in equations 40 and 41.

$\begin{matrix} {x_{t + 1} = {\sum\limits_{i = 0}^{I}{\sum\limits_{j = 0}^{J}{F_{ij}x_{t}^{(i)}u_{t}^{(j)}}}}} & \left( {{Equation}\mspace{14mu} 40} \right) \\ {y_{t} = {\sum\limits_{i = 0}^{I}{\sum\limits_{j = 0}^{J}{H_{ij}x_{t}^{(i)}u_{t}^{(j)}}}}} & \left( {{Equation}\mspace{14mu} 41} \right) \end{matrix}$

In equations 40 and 41, the notation x_(t) ^((i)) can be defined recursively as x_(t) ^((i))=x_(t){circle around (x)}x_(t) ^((i−1)) and similarly for u_(t) ^((j)).
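
The recursive definition can be sketched in Python as follows. The function name kron_power is hypothetical, and the convention that the zeroth power is the scalar 1 is an assumption made so that the terms with i=0 or j=0 in the sums are well defined.

```python
import numpy as np

def kron_power(x, i):
    """x^(i) = x (x) x^(i-1), with x^(0) taken as the scalar 1 (assumed)."""
    out = np.ones(1)
    for _ in range(i):
        out = np.kron(x, out)
    return out

x = np.array([1.0, 2.0])
print(kron_power(x, 2))        # [1. 2. 2. 4.]
print(kron_power(x, 3).size)   # 8
```

The length of x^(i) is the dimension of x raised to the power i, which is one way to see how quickly the number of polynomial terms grows with the degree.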

Thus, equations 40 and 41 may be polynomial expansions of the nonlinear functions f(·) and h(·). Note that the nonlinear equations may involve nonlinear functions of relatively simple form, such as the approximating polynomial equations that involve only sums of products that are readily computed for various purposes. However, for empirical estimation of the coefficients in the presence of autocorrelated errors, the problem can become difficult even for low dimensions of y, u, and x, even using subspace methods. For subspace methods, the matrix dimensions can grow exponentially with the dimension of the ‘past’ that can be used. This can occur in expanding equation 40 by repeatedly substituting, for the terms x_(t) ^((i)) on the right hand side of equation 40, the expression for x_(t) given by equation 40 with t replaced by t−1. Thus the entire right hand side of equation 40 with t replaced by t−1 can be raised to the power lagp, the order of the past typically selected as the estimated ARX model order. Even for a relatively low order past, and low dimensions of x_(t) and u_(t), the number of additive terms can increase exponentially.

However, in a further exemplary embodiment and following Rugh, equations 40 and 41 can be converted through Carleman bilinearization to bilinear vector differential equations in the state variable as shown in equation 42:

x _(t) ^({circle around (x)}) =[x _(t) ⁽¹⁾ ;x _(t) ⁽²⁾ ; . . . x _(t) ^((I))];   (42)

and the input power and products variables as shown in equation 43.

u _(t) ^({circle around (x)}) =[u _(t) ⁽¹⁾ ;u _(t) ⁽²⁾ ; . . . u _(t) ^((J))];   (43)

In a further exemplary embodiment, equations 40 and 41, which express the state-affine form, can then be rewritten as equations 44 and 45, below.

x _(t+1) ^({circle around (x)}) =A(x _(t) ^({circle around (x)}) {circle around (x)}u _(t) ^({circle around (x)}))+Bu _(t) ^({circle around (x)})  (44)

y _(t) =C(x _(t) ^({circle around (x)}) {circle around (x)}u _(t) ^({circle around (x)}))+Du _(t) ^({circle around (x)})  (45)

Next, it may be assumed that the coefficient matrices (A, B, C, D) may be linear functions of scheduling parameters ρ_(t) denoted as (A(ρ_(t)),B(ρ_(t)),C(ρ_(t)),D(ρ_(t))) and in state-affine form as in equations 9 through 12. The scheduling parameters ρ_(t) may be nonlinear functions of the operating point or other known or accurately measured variables. For example, since the inputs u_(t) ^({circle around (x)}) are multiplicative and can be assumed to be known in real time or accurately measured, they can be absorbed into the scheduling parameters ρ_(t), thereby possibly decreasing their dimension. The bilinear equations can then become those shown in equations 46 and 47 below.

x _(t+1) ^({circle around (x)}) =A(ρ_(t))x _(t) ^({circle around (x)}) +B(ρ_(t))u _(t) ^({circle around (x)})  (46)

y _(t) =C(ρ_(t))x _(t) ^({circle around (x)}) +D(ρ_(t))u _(t) ^({circle around (x)})  (47)

Equations 46 and 47 are in explicitly LPV form. Therefore, for functions f(x_(t),u_(t),ρ_(t),v_(t)) and h(x_(t),u_(t),ρ_(t),v_(t)) that may be sufficiently smooth and for which there may exist a well defined power series expansion in a neighborhood of the variable, there can exist an LPV approximation of the process.

Thus, with the absorption of the inputs u_(t) ^({circle around (x)}) into the scheduling parameter ρ_(t), the equations may be linear time varying and, in some instances, may have a higher dimension ρ_(t). This can show the dual roles of inputs and scheduling parameters and how they may be interchangeable in a variety of manners and, for example, their difference may be their rate of variation with time. The knowledge of scheduling can therefore often be derived from the underlying physics, chemistry or other fundamental information. LPV models can therefore be considered, on occasion, as graybox models insofar as they may be able to incorporate considerable global information about the behavior of the process that can incorporate into the model how the process dynamics can change with operating point.

The foregoing description and accompanying figures illustrate the principles, preferred embodiments and modes of operation of the invention. However, the invention should not be construed as being limited to the particular embodiments discussed above. Additional variations of the embodiments discussed above will be appreciated by those skilled in the art.

Therefore, the above-described embodiments should be regarded as illustrative rather than restrictive. Accordingly, it should be appreciated that variations to those embodiments can be made by those skilled in the art without departing from the scope of the invention as defined by the following claims. 

What is claimed is:
 1. A method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, comprising:
sensing, with a sensor, dynamic system data;
storing sensed dynamic system data on a memory;
performing at least one of a nonlinear and linear autoregressive with inputs ((N)ARX) model fitting with one of a predetermined autoregressive with inputs (ARX) model and a predetermined nonlinear ARX (NARX) model to the stored dynamic system data, the fitting comprising:
performing a parameter estimation using stored dynamic system input and output data determined from a predetermined iteration of an algorithm for subspace identification with a processor, at least one of a set of ARX models of increasing order with a specified maximum order or a set of linear regression problems in terms of NARX models of increasing order and monomial degrees with a specified maximum order and degree, comprising:
performing a model comparison, with a processor, to compute an Akaike's Information Criterion (AIC) of model fits for at least each ARX order and each NARX order and degree;
selecting a model that minimizes the AIC for at least one of a set of predetermined ARX models with a minimum AIC and a set of predetermined NARX models with a minimum AIC, wherein if more than one model achieves the desired minimum AIC, then selecting the ARX model or NARX model that further minimizes the number of estimated parameters that is also computed in the AIC computation;
performing a state space model fitting of a state space dynamic model of dynamic system operation that is parametric in its scheduling parameters, with a processor, using the ARX or NARX model selected as minimizing AIC, the state space model fitting comprising:
performing a corrected future calculation, by a processor, the corrected future calculation determining one or more corrected future outputs of dynamic system data through prediction and subtraction of an effect of one or more future inputs of dynamic system data on future outputs of the algorithm;
determining estimates of states with values whose elements are ordered as their predictive correlation for the future by performing a canonical variate analysis (CVA), with a processor, between corrected future outputs and past augmented inputs;
selecting one of a state order that minimizes the AIC or the lowest order of state orders that minimize the AIC;
inputting the estimates of states into one or more state equations;
performing a linear regression calculation on the one or more state equations to determine matrix coefficients of the state equations, and
providing a dynamic model of dynamic system data in the form of state equations with linear parameter varying matrix coefficients as functions of the scheduling parameters to extend subspace identification methods to LPV and nonlinear parameter varying systems with general scheduling functions.
 2. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 1, wherein the ARX model fitting has a linear regression problem that assumes a maximum ARX order to be considered.
 3. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 1, wherein the fitting of the ARX models further comprises: a specified ARX order lagp wherein, for the specified ARX order lagp, and for every time t greater than lagp, the prediction of outputs y_(t) using an autoregression of past outputs y_(t−i) for 0<i<lagp+1, and an exogenous moving average comprising past inputs u_(t−i), and further augmented past inputs ? ⊗ ? and augmented state estimates ? ⊗ ? (where ? indicates text missing or illegible when filed) augmented respectively by Kronecker products that are similarly time shifted past scheduling functions ρ_(t−i) for −1<i<lagp+1.
 4. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 1, wherein the dynamic system is an engine.
 5. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 4, wherein the engine is an automotive engine.
 6. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 4, wherein the engine is a turbine engine.
 7. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 1, wherein the dynamic system is a chemical process.
 8. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 7, wherein the chemical process occurs in a stirred tank reactor.
 9. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 7, wherein the chemical process occurs in one or more distillation columns.
 10. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 1, wherein the dynamic system is an aircraft.
 11. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 1, wherein the dynamic system is an automobile.
 12. The method of modelling a dynamic system by extending subspace identification methods to linear parameter varying (LPV) and nonlinear parameter varying systems with general scheduling functions, of claim 1, wherein the dynamic system is a ship. 